Using word lattice information for a tighter coupling in speech translation systems
Abstract
In this paper we present first experiments towards a tighter coupling between Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT), with the aim of improving the overall performance of our speech translation system. In conventional speech translation systems, the recognizer outputs a single hypothesis, which is then translated by the SMT system. This approach is limited by its heavy dependence on the word error rate of the first-best hypothesis. The word error rate is typically lowered by generating many alternative hypotheses in the form of a word lattice. The translation system can use the information in the word lattice, together with the recognizer's scores, to obtain better performance. In our experiments, switching from single-best hypotheses to word lattices as the interface between ASR and SMT, and introducing weighted acoustic scores into the translation system, increased overall performance by 16.22%.
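The idea of passing a word lattice with weighted acoustic scores to the translation step can be illustrated with a minimal Python sketch. It is not the authors' system: the lattice layout, the `toy_translation_score` function, the example scores, and the `acoustic_weight` parameter are all hypothetical, chosen only to show how a downstream score can select a path that differs from the recognizer's first-best hypothesis.

```python
def best_path(edges, final_node, translation_score, acoustic_weight):
    """Return (score, words) for the best path from node 0 to final_node.

    edges: list of (start, end, word, acoustic_logprob) with start < end,
           so processing edges by start node follows a topological order.
    translation_score: hypothetical stand-in for a downstream SMT score.
    acoustic_weight: hypothetical scaling factor for the recognizer score.
    """
    best = {0: (0.0, [])}  # node -> (best score so far, words on that path)
    for start, end, word, acoustic in sorted(edges, key=lambda e: e[0]):
        if start not in best:
            continue  # node is not reachable from the start node
        score = (best[start][0]
                 + acoustic_weight * acoustic
                 + translation_score(word))
        if end not in best or score > best[end][0]:
            best[end] = (score, best[start][1] + [word])
    return best[final_node]


# Toy lattice: two competing words on the first arc, one word on the second.
edges = [
    (0, 1, "their", -1.0),  # first-best by acoustic score alone
    (0, 1, "there", -1.2),
    (1, 2, "is", -0.5),
]

# Invented downstream scores, for illustration only.
TOY_SCORES = {"their": -3.0, "there": -1.0, "is": -0.5}


def toy_translation_score(word):
    return TOY_SCORES[word]


score, words = best_path(edges, 2, toy_translation_score, acoustic_weight=1.0)
print(words, score)
```

With `acoustic_weight=1.0` the combined score prefers "there is", although "their" has the better acoustic score. A much larger weight (for example 20.0) makes the acoustic score dominate and selects "their is", which mirrors the single-best interface.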
Similar resources
th Jeju Island, Korea, October 4-8, 2004
In this paper we present first experiments towards a tighter coupling between Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT) to improve the overall performance of our speech translation system. In conventional speech translation systems, the recognizer outputs a single hypothesis which is then translated by the SMT system. This approach has the limitation of being l...
Combining natural language processing systems to improve machine translation of speech
Machine translation of spoken language is a challenging task that involves several natural language processing (NLP) software modules. Human speech in one natural language has to be first automatically transcribed by a speech recognition system. Next, the transcription of the spoken utterance can be translated into another natural language by a machine translation system. In addition, it may be...
A Coarse-Grained Model for Optimal Coupling of ASR and SMT Systems for Speech Translation
Speech translation is conventionally carried out by cascading an automatic speech recognition (ASR) and a statistical machine translation (SMT) system. The hypotheses chosen for translation are based on the ASR system’s acoustic and language model scores, and typically optimized for word error rate, ignoring the intended downstream use: automatic translation. In this paper, we present a coarset...
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
Publication date: 2004